<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<title>Matlab routines for Linear Predictive Coding (LPC)</title>
</head>
<body link="#0000FF" vlink="#800080">
<h1>Matlab routines for Linear Predictive Coding (LPC)</h1>
<hr>
<p>Return to <a href="voicebox.html">voicebox home page</a></p>
<hr>
<h2>Data Format</h2>
<p>All the LPC routines described in this section can process several frames
together. Each frame corresponds to a single row of the data matrix; if there
is only one frame then the matrix must be a row vector rather than a column
vector.</p>
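<p>The row-per-frame layout can be sketched as follows. The coefficient values
are made up for illustration and the variable names are not part of
Voicebox.</p>
<pre>
% Hypothetical AR coefficients for one second-order frame, held as a column.
a = [1; -1.2; 0.8];
ar1 = a.';                       % a single frame must be a row: 1-by-(p+1)

% Two frames of the same order p = 2, stored one frame per row.
ar = [1 -1.2 0.8; 1 -0.5 0.25];
[nf, nc] = size(ar);             % nf = number of frames, nc = p+1 coefficients
</pre>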
<h2>LPC Analysis</h2>
<p>Two routines are provided: <a href="mdoc/v_mfiles/v_lpcauto.html">lpcauto</a> for
autocorrelation analysis and <a href="mdoc/v_mfiles/v_lpccovar.html">lpccovar</a> for
covariance analysis.</p>
<p>The analysis order, <i>p</i>, denotes the number of poles in the resultant
autoregressive filter. The appropriate value for <i>p</i> is typically
2+<i>f<sub>s</sub></i>/1kHz where <i>f<sub>s</sub></i> is the sample frequency.
This expression assumes that sound takes about &frac12; ms to travel the length of the
vocal tract.</p>
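<p>As a rough illustration of this rule of thumb, the sketch below evaluates
the expression for a few example sample frequencies. Rounding to the nearest
integer is an assumption made here, since the order must be a whole
number.</p>
<pre>
fs = [8000 11025 16000];     % example sample frequencies in Hz
p = 2 + round(fs/1000);      % rule of thumb p = 2 + fs/1kHz, rounded (assumption)
% p is [10 13 18]
</pre>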
<h3>Fancy versions of LPC</h3>
<p>Although the analysis order must be the same for all frames, the individual
frame lengths can vary; this allows <i>pitch-synchronous</i> analysis. It is
also possible to restrict the analysis interval to particular segments of each
frame to allow <i>closed-phase</i> analysis. For high pitched voices, the
closed phases may be very short: it is possible in this case to combine the
data from two or more consecutive cycles to give <i>multi-cycle
closed-phase</i> analysis. To obtain reliable estimates, the AR coefficients
must be based on at least 2 ms of data.</p>
<h2>LPC Coefficient Representations</h2>
<p>The coefficients generated by LPC analysis can be represented in many
equivalent forms. Voicebox recognizes the coefficient sets listed below and
denotes each with a two-letter mnemonic. The number of coefficients varies: for
an analysis of order <i>p</i> there can be <i>p</i>, <i>p</i>+1, or <i>p</i>+2
coefficients. This is indicated in the table. The meaning of the coefficient
sets is explained below with reference to the lossless tube model of speech
production. Routines are provided to convert each representation to the other
forms indicated: the routine that converts from representation <em>xx</em> to
representation <em>yy</em> is called lpc<em>xx</em>2<em>yy</em>.</p>
<p>The routine <a href="mdoc/v_mfiles/v_lpcconv.html">lpcconv</a> can be used to figure out
the sequence of calls needed to convert between any pair of these
representations.</p>
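<p>As an illustration of the naming convention, the sketch below converts AR
coefficients to log area ratios. The table lists no direct <em>ar</em> to
<em>lo</em> routine, so the conversion goes through <em>rf</em>. It assumes
that Voicebox is on the MATLAB path and that the routines carry the
<code>v_</code> prefix seen in the linked file names; the coefficient values
are made up.</p>
<pre>
% Assumes Voicebox is installed; the v_ prefix is taken from the linked file names.
ar = [1 -1.2 0.8];       % hypothetical AR coefficients for one frame (p = 2), as a row
rf = v_lpcar2rf(ar);     % ar to rf: reflection coefficients
lo = v_lpcrf2lo(rf);     % rf to lo: log area ratios
</pre>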
<table border="0" cellpadding="6" cellspacing="0" width="100%">
<tr>
<td valign="top" width="50"><b>Code</b></td>
<td valign="top" width="70"><b>Size</b></td>
<td valign="top" width="120"><b>Convert from</b></td>
<td valign="top" width="120"><b>Convert to</b></td>
<td valign="top"><b>Description</b></td>
</tr>
<tr>
<td valign="top" width="50">aa</td>
<td valign="top" width="70"><i>p</i>+2</td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcdl2aa.html">dl</a>, <a href=
"mdoc/v_mfiles/v_lpcrf2aa.html">rf</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcaa2ao.html">ao</a>, <a href=
"mdoc/v_mfiles/v_lpcaa2dl.html">dl</a>, <a href="mdoc/v_mfiles/v_lpcaa2rf.html">rf</a></td>
<td valign="top">The <i>area coefficients</i> represent the cross-sectional
areas of the vocal tract segments. The areas are normalised so that
aa(<i>p</i>+2), the effective area of the free space beyond the lips, is equal
to 1. aa(1) is the area at the glottis and is usually near 0.</td>
</tr>
<tr>
<td valign="top" width="50">am</td>
<td valign="top" width="70">(<em>p</em>+1)<sup>2</sup></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcar2am.html">ar</a>, <a href=
"mdoc/v_mfiles/v_lpcrr2am.html">rr</a></td>
<td valign="top" width="120"></td>
<td valign="top">An upper unit-triangular matrix containing the AR coefficients
for all orders 0,...,<em>p</em>. This matrix is a diagonal multiple of the
Hermitian square root of the symmetric Toeplitz matrix toeplitz(rr).</td>
</tr>
<tr>
<td valign="top" width="50">ao</td>
<td valign="top" width="70"><i>p</i>+1</td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcaa2ao.html">aa</a>, <a href=
"mdoc/v_mfiles/v_lpcrf2ao.html">rf</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcao2rf.html">rf</a></td>
<td valign="top">The <i>area ratios</i> give the ratio of the area of one tube
segment to that of the following segment.</td>
</tr>
<tr>
<td valign="top" width="50">ar</td>
<td valign="top" width="70"><i>p</i>+1</td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpccc2ar.html">cc</a>, <a href=
"mdoc/v_mfiles/v_lpcim2ar.html">im</a>, <a href="mdoc/v_mfiles/v_lpcls2ar.html">ls</a>, 
<a href="mdoc/v_mfiles/v_lpcra2ar.html">ra</a>, <a href=
"mdoc/v_mfiles/v_lpcrf2ar.html">rf</a>, <a href="mdoc/v_mfiles/v_lpcrr2ar.html">rr</a>, <a href=
"mdoc/v_mfiles/v_lpczz2ar.html">zz</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcar2am.html">am</a>, <a href=
"mdoc/v_mfiles/v_lpcar2cc.html">cc</a>, <a href="mdoc/v_mfiles/v_lpcar2db.html">db</a>, <a href=
"mdoc/v_mfiles/v_lpcar2ff.html">ff</a>, <a href="mdoc/v_mfiles/v_lpcar2im.html">im</a>, <a href=
"mdoc/v_mfiles/v_lpcar2ls.html">ls</a>, <a href="mdoc/v_mfiles/v_lpcar2pf.html">pf</a>, <a href=
"mdoc/v_mfiles/v_lpcar2pp.html">pp</a>, <a href="mdoc/v_mfiles/v_lpcar2ra.html">ra</a>, <a href=
"mdoc/v_mfiles/v_lpcar2rf.html">rf</a>, <a href="mdoc/v_mfiles/v_lpcar2rr.html">rr</a>, <a href=
"mdoc/v_mfiles/v_lpcar2zz.html">zz</a></td>
<td valign="top">The <i>autoregressive coefficients</i> or <i>AR
coefficients</i> represent the transfer function from the output flow of the
vocal tract to the input flow. The coefficients are usually normalised so that
ar(1)=1.</td>
</tr>
<tr>
<td valign="top" width="50">cc</td>
<td valign="top" width="70"><i>p</i></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcar2cc.html">ar</a>, <a href=
"mdoc/v_mfiles/v_lpcpf2cc.html">pf</a>, <a href=
"mdoc/v_mfiles/v_lpczz2cc.html">zz</a></td>
<td valign="top" width="120"><a href=
"mdoc/v_mfiles/v_lpccc2ar.html">ar</a></td>
<td valign="top">The <i>complex cepstrum coefficients</i> are actually real
despite their name. They equal the inverse Fourier transform of the log
frequency response of the autoregressive filter. These coefficients do not
include cc(0) which is the DC component of the log frequency response.</td>
</tr>
<tr>
<td valign="top" width="50">cw</td>
<td valign="top" width="70"><i>p</i></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcpp2cw.html">pp</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpccw2zz.html">zz</a></td>
<td valign="top">The roots of the power spectrum polynomial <i>pp</i>. These
are the (normally complex) values of cos(w) that make the power spectrum of the
inverse filter equal to zero.</td>
</tr>
<tr>
<td valign="top" width="50">db</td>
<td valign="top" width="70"><i>p</i>+1</td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcar2db.html">ar</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcdb2pf.html">pf</a></td>
<td valign="top">The <i>power spectrum</i> of the AR filter expressed in
decibels. The first and last elements of db() are respectively the DC and
Nyquist terms.</td>
</tr>
<tr>
<td valign="top" width="50">dl</td>
<td valign="top" width="70"><i>p</i></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcaa2dl.html">aa</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcdl2aa.html">aa</a></td>
<td valign="top">The <i>discrete cosine transform</i> of the log
cross-sectional area function of the tube.</td>
</tr>
<tr>
<td valign="top" width="50">ff</td>
<td valign="top" width="70"><i>p</i>+1</td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcar2ff.html">ar</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcff2pf.html">pf</a></td>
<td valign="top">The <i>complex frequency response</i> of the AR filter. The
first and last elements of ff() are respectively the DC and Nyquist terms.</td>
</tr>
<tr>
<td valign="top" width="50">im</td>
<td valign="top" width="70"><i>p</i>+1</td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcar2im.html">ar</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcim2ar.html">ar</a></td>
<td valign="top">The <i>impulse response</i> of the autoregressive filter.</td>
</tr>
<tr>
<td valign="top" width="50">is</td>
<td valign="top" width="70"><i>p</i>+1</td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcrf2is.html">rf</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcis2rf.html">rf</a></td>
<td valign="top">The <i>inverse sine</i> coefficients equal sin<sup>-1</sup> of
the reflection coefficients multiplied by 2/pi to force them to lie in the
range &plusmn;1 for a stable filter.</td>
</tr>
<tr>
<td valign="top" width="50">la</td>
<td valign="top" width="70"><i>p</i>+2</td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcrf2la.html">rf</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcla2rf.html">rf</a></td>
<td valign="top">The <i>log area coefficients</i> are the log cross-sectional
areas of the vocal tract segments. la(<i>p</i>+2) is the log of the effective
area of the free space beyond the lips and is normalised to 0.</td>
</tr>
<tr>
<td valign="top" width="50">lo</td>
<td valign="top" width="70"><i>p</i>+1</td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcrf2lo.html">rf</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpclo2rf.html">rf</a></td>
<td valign="top">The <i>log area ratios</i> give the log of the ratio of the
area of one tube segment to that of the following segment. These values are
limited by the conversion routines to about &plusmn;14.5.</td>
</tr>
<tr>
<td valign="top" width="50">ls</td>
<td valign="top" width="70"><i>p</i></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcar2ls.html">ar</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcls2ar.html">ar</a></td>
<td valign="top">The <i>line spectrum frequencies</i> or <i>line spectrum
pairs</i> are normalised frequencies in the range 0 to 0.5. A sharp peak in the
AR filter response will give rise to a pair of line spectrum frequencies near
the peak.</td>
</tr>
<tr>
<td valign="top" width="50">pf</td>
<td valign="top" width="70"><i>p</i>+1</td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcar2pf.html">ar</a>, <a href=
"mdoc/v_mfiles/v_lpcdb2pf.html">db</a>, <a href="mdoc/v_mfiles/v_lpcff2pf.html">ff</a>, <a href=
"mdoc/v_mfiles/v_lpcra2pf.html">ra</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcpf2cc.html">cc</a>, <a href=
"mdoc/v_mfiles/v_lpcpf2rr.html">rr</a></td>
<td valign="top">The <i>power spectrum</i> of the AR filter. The first and last
elements of pf() are respectively the DC and Nyquist terms.</td>
</tr>
<tr>
<td valign="top" width="50">pp</td>
<td valign="top" width="70"><i>p</i>+1</td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcar2pp.html">ar</a>, <a href=
"mdoc/v_mfiles/v_lpcra2pp.html">ra</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcpp2cw.html">cw</a></td>
<td valign="top">The <i>power spectrum polynomial coefficients</i>. This
polynomial gives the power spectrum of the all-zero inverse filter as a
function of cos(w).</td>
</tr>
<tr>
<td valign="top" width="50">ra</td>
<td valign="top" width="70"><i>p</i>+1</td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcar2ra.html">ar</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcra2pf.html">pf</a>, <a href=
"mdoc/v_mfiles/v_lpcra2pp.html">pp</a>, <a href="mdoc/v_mfiles/v_lpcra2ar.html">ar</a></td>
<td valign="top">The <i>autocorrelation coefficients</i> of the inverse
filter's impulse response (the inverse filter is an FIR filter).</td>
</tr>
<tr>
<td valign="top" width="50">rf</td>
<td valign="top" width="70"><i>p</i>+1</td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcaa2rf.html">aa</a>, <a href=
"mdoc/v_mfiles/v_lpcar2rf.html">ar</a>, <a href="mdoc/v_mfiles/v_lpcis2rf.html">is</a>, <a href=
"mdoc/v_mfiles/v_lpcla2rf.html">la</a>, <a href="mdoc/v_mfiles/v_lpclo2rf.html">lo</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcrf2aa.html">aa</a>, <a href=
"mdoc/v_mfiles/v_lpcrf2ao.html">ao</a>, <a href="mdoc/v_mfiles/v_lpcrf2ar.html">ar</a>, <a href=
"mdoc/v_mfiles/v_lpcrf2is.html">is</a>, <a href="mdoc/v_mfiles/v_lpcrf2la.html">la</a>, <a href=
"mdoc/v_mfiles/v_lpcrf2lo.html">lo</a>, <a href="mdoc/v_mfiles/v_lpcrf2rr.html">rr</a></td>
<td valign="top">The <i>reflection coefficients</i> give the relative
amplitudes of the incident and reflected pressure waves at the junction between
two tube segments. The direction of travel of the incident wave is from the
glottis towards the lips. rf(1) is the reflection coefficient at the glottis
and rf(<i>p</i>+1) is the reflection coefficient at the lips: both of these
coefficients are normally close to 1. Reversing the order of the reflection
coefficients leaves the tube transfer function unchanged.</td>
</tr>
<tr>
<td valign="top" width="50">rr</td>
<td valign="top" width="70"><i>p</i>+1</td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcar2rr.html">ar</a>, <a href=
"mdoc/v_mfiles/v_lpcpf2rr.html">pf</a>, <a href="mdoc/v_mfiles/v_lpcrf2rr.html">rf</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcrr2am.html">am</a>, <a href=
"mdoc/v_mfiles/v_lpcrr2ar.html">ar</a></td>
<td valign="top">The <i>autocorrelation coefficients</i> of the autoregressive
filter's impulse response when extended to an infinite number of terms.</td>
</tr>
<tr>
<td valign="top" width="50">ss</td>
<td valign="top" width="70"><i>p</i></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpczz2ss.html">zz</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcss2zz.html">zz</a></td>
<td valign="top">The <i>s-plane autoregressive poles</i> are the roots of the
<i>AR</i> coefficient polynomial mapped onto the s-plane and expressed in
normalised Hz. If ss() is multiplied by the sample frequency, a formant with
frequency <i>f</i> and bandwidth <i>b</i> will give an s-plane pole of
approximately -<i>b</i>/2 &plusmn; j<i>f</i>.</td>
</tr>
<tr>
<td valign="top" width="50">zz</td>
<td valign="top" width="70"><i>p</i></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpcar2zz.html">ar</a>, <a href=
"mdoc/v_mfiles/v_lpccw2zz.html">cw</a>, <a href="mdoc/v_mfiles/v_lpcss2zz.html">ss</a></td>
<td valign="top" width="120"><a href="mdoc/v_mfiles/v_lpczz2ar.html">ar</a>, <a href=
"mdoc/v_mfiles/v_lpczz2cc.html">cc</a>, <a href="mdoc/v_mfiles/v_lpczz2ss.html">ss</a></td>
<td valign="top">The <i>z-plane autoregressive poles</i> are the roots of the
<i>AR</i> coefficient polynomial.</td>
</tr>
</table>
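<p>The relationship between the <em>zz</em> and <em>ss</em> representations and
a formant's frequency and bandwidth can be sketched in plain MATLAB without
Voicebox. The mapping used here, z = exp(2&pi;s) with s in normalised Hz, is an
assumption consistent with the description in the table; the sample frequency
and formant values are hypothetical.</p>
<pre>
fs = 8000;                          % assumed sample frequency in Hz
f0 = 500; b0 = 80;                  % hypothetical formant frequency and bandwidth in Hz

% Build a z-plane pole pair for this formant and the matching AR polynomial.
z0 = exp(2*pi*(-b0/2 + 1j*f0)/fs);
ar = real(poly([z0 conj(z0)]));     % ar(1) = 1

zz = roots(ar).';                   % z-plane poles as a row (one frame)
ss = log(zz)/(2*pi);                % s-plane poles in normalised Hz (assumed mapping)
sp = ss*fs;                         % scaled to Hz: -b/2 +/- jf
f = abs(imag(sp));                  % recovered formant frequency in Hz
b = -2*real(sp);                    % recovered bandwidth in Hz
</pre>
<p>Because the pole pair is constructed with the same mapping, this example
recovers <i>f</i> and <i>b</i> to within rounding error; for measured speech the
relation is only approximate.</p>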
</body>
</html>
